-
Notifications
You must be signed in to change notification settings - Fork 25
Fix aliasing #125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix aliasing #125
Conversation
📝 WalkthroughSummary by CodeRabbit
WalkthroughRefactors buffer alias tracking and related APIs: renames and reimplements ancestor-liveliness checks to an alias-based traversal, converts Changes
Sequence Diagram(s)sequenceDiagram
participant MMG as MemoryManagementGeneration
participant VB as VariableBuffer
participant CTX as NetworkContext
rect #E8F6FF
MMG->>VB: Check deallocation eligibility (buffer, ctxt)
VB->>VB: has_live_aliases(ctxt) -- BFS across aliases
alt aliases are live
VB-->>MMG: True (do not deallocate)
else no live aliases
VB-->>MMG: False (safe to deallocate)
end
end
sequenceDiagram
participant Parser as ReshapeParser
participant Template as ReshapeTemplate
participant VB as VariableBuffer
Parser->>Template: parseNodeCtxt → operatorRepresentation (map inputs→data_in, outputs→data_out)
Template->>VB: assert bufferIn and bufferOut are VariableBuffer
Template->>VB: bufferIn.aliases += bufferOut.name
Template->>VB: bufferOut.aliases += bufferIn.name
Note right of VB: bidirectional alias linkage established
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
Deeploy/DeeployTypes.py (2)
327-349: Bug: visited initialized as set(self.name) (set of characters), not a set of names.This can degrade traversal correctness and performance for alias cycles. Initialize with the buffer name itself.
- visited = set(self.name) + visited = {self.name} - while len(queue) > 0: - next = queue.pop() - buffNext = ctxt.lookup(next) + while len(queue) > 0: + next_name = queue.pop() + buffNext = ctxt.lookup(next_name) assert isinstance(buffNext, VariableBuffer) live |= buffNext._live - visited.add(next) - queue |= buffNext.aliases - visited + visited.add(next_name) + queue |= (buffNext.aliases - visited)
946-979: Make hoistConstant robust to np.ndarray-valued constants.The method currently assumes a gs.Constant object, but line 2462 in DeeployTypes.py passes
x.attrs['value']directly, which may be a numpy array. This will crash when accessing.outputsand.name. Wrap numpy arrays intogs.Constantbefore processing, and update the signature to acceptUnion[gs.Constant, np.ndarray].- def hoistConstant(self, - constant: gs.Constant, - name: Optional[str] = None, - _type: Optional[Type[Pointer]] = None) -> str: + def hoistConstant(self, + constant: Union[gs.Constant, np.ndarray], + name: Optional[str] = None, + _type: Optional[Type[Pointer]] = None) -> str: """Register a ConstantBuffer extracted directly from a graphsurgeon Constant Parameters ---------- - constant : gs.Constant - graphsurgeon.Node containing a single constant output + constant : Union[gs.Constant, np.ndarray] + graphsurgeon.Node containing a single constant output, or a numpy array name : str Name of the ConstantBuffer to be registered _type : Optional[Type[Pointer]] Optional type assignment of the registered ConstantBuffer Returns ------- str Returns the name of the newly registed ConstantBuffer """ + # Normalize to a gs.Constant (GraphSurgeon tensor) + if isinstance(constant, np.ndarray): + constant = gs.Constant(name if name is not None else "CONST", constant) assert len(constant.outputs) <= 1, f"Constant {constant.name} has more than one output" name = name if name is not None else constant.name
🧹 Nitpick comments (2)
Deeploy/Targets/Generic/Parsers.py (1)
1023-1027: Use zip(..., strict=True) to enforce arity assumptions.parseNode already constrains IO counts; strict makes this explicit and prevents silent truncation if assumptions change.
If Python >= 3.10 in your runtime, apply:
- for tensor, symName in zip(node.inputs, ['data_in', 'shape']): + for tensor, symName in zip(node.inputs, ['data_in', 'shape'], strict=True): self.operatorRepresentation[symName] = ctxt.lookup(tensor.name).name - for tensor, symName in zip(node.outputs, ['data_out']): + for tensor, symName in zip(node.outputs, ['data_out'], strict=True): self.operatorRepresentation[symName] = ctxt.lookup(tensor.name).nameDeeploy/Targets/Generic/Templates/ReshapeTemplate.py (1)
28-36: Good: explicit, bidirectional alias linkage. Add self-guard.Avoid self-edges if data_in == data_out.
- # Link aliases to each buffer - bufferIn.aliases.add(bufferOut.name) - bufferOut.aliases.add(bufferIn.name) + # Link aliases to each buffer (skip self-aliasing) + if bufferIn.name != bufferOut.name: + bufferIn.aliases.add(bufferOut.name) + bufferOut.aliases.add(bufferIn.name)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
Deeploy/CommonExtensions/CodeTransformationPasses/MemoryAllocation.py(1 hunks)Deeploy/DeeployTypes.py(7 hunks)Deeploy/Targets/Generic/Parsers.py(1 hunks)Deeploy/Targets/Generic/Templates/ReshapeTemplate.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
Deeploy/Targets/Generic/Templates/ReshapeTemplate.py (1)
Deeploy/DeeployTypes.py (5)
NetworkContext(508-1020)NodeTemplate(87-229)VariableBuffer(232-360)lookup(720-752)add(681-718)
Deeploy/CommonExtensions/CodeTransformationPasses/MemoryAllocation.py (1)
Deeploy/DeeployTypes.py (1)
has_live_aliases(327-349)
Deeploy/Targets/Generic/Parsers.py (1)
Deeploy/DeeployTypes.py (3)
inputs(2503-2520)lookup(720-752)outputs(2522-2539)
Deeploy/DeeployTypes.py (2)
DeeployTest/testUtils/dmaUtils.py (1)
size(72-73)Deeploy/AbstractDataTypes.py (3)
Pointer(303-388)PointerClass(536-559)VoidType(121-127)
🪛 Ruff (0.14.0)
Deeploy/Targets/Generic/Parsers.py
1023-1023: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
1025-1025: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
Deeploy/DeeployTypes.py
241-241: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
🔇 Additional comments (2)
Deeploy/DeeployTypes.py (1)
241-261: Fix for mutable default argument looks good.Switching to aliases: Optional[List[str]] = None and materializing a set resolves the shared-list bug and improves semantics.
Deeploy/CommonExtensions/CodeTransformationPasses/MemoryAllocation.py (1)
144-151: Deallocation guard updated to has_live_aliases — good alignment with new alias model.Ensure alias links are symmetric for all aliasing ops (as done in ReshapeTemplate) so this guard remains sound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
CHANGELOG.md (1)
7-7: LGTM! Entries are accurate and appropriately categorized.The changelog entries align well with the PR objectives (removal of
fromVariableBuffer,hoistConstantrefactor,TransientBuffer.__init__refactor, and aliasing bug fix). The terse style is consistent with other entries in the file.Consider optionally expanding line 90 ("Fixed aliasing") with a brief note about the root cause (e.g., "mutable default argument causing shared state") for clarity to future readers, though the current entry is acceptable as-is.
Also applies to: 77-79, 90-90
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
CHANGELOG.md(3 hunks)
🧰 Additional context used
🪛 LanguageTool
CHANGELOG.md
[grammar] ~77-~77: There might be a mistake here.
Context: ...d of generateFunction(...). - Removed fromVariableBuffer - Refactored hoistConstant - Refactored ...
(QB_NEW_EN)
[grammar] ~78-~78: There might be a mistake here.
Context: ...moved fromVariableBuffer - Refactored hoistConstant - Refactored TransientBuffer's __init__ ...
(QB_NEW_EN)
🔇 Additional comments (1)
CHANGELOG.md (1)
77-78: Grammar warnings are false positives.The static analysis hints flag lines 77–78 as potential grammar issues, but the Markdown formatting (backticks around identifiers, bullet-point dashes) is correct and consistent with the rest of the file. These warnings appear to be tool parsing artifacts.
Xeratec
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the refactor!
|
I took the freedom to rebase this PR after merging #98. I will merge it once the tests succeed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
Deeploy/DeeployTypes.py (1)
327-349: Critical bug: incorrect initialization of visited set.Line 341 incorrectly initializes
visitedasset(self.name). Sinceself.nameis a string, this creates a set of individual characters rather than a set containing the string. This breaks the BFS traversal because:
- Line 347 adds buffer names (strings) to
visited- Line 348's set subtraction
buffNext.aliases - visitedwon't work correctly- Could cause infinite loops if there are cycles in the alias graph
Apply this diff:
- visited = set(self.name) + visited = {self.name}
🧹 Nitpick comments (1)
Deeploy/Targets/Generic/Parsers.py (1)
1070-1074: Clean simplification that improves readability.The refactored
parseNodeCtxtis more concise and maintainable. SinceparseNodealready validates the input/output counts, the zip operations will always have matching lengths.Optionally, consider adding
strict=Trueto thezip()calls for defensive programming (requires Python 3.10+):- for tensor, symName in zip(node.inputs, ['data_in', 'shape']): + for tensor, symName in zip(node.inputs, ['data_in', 'shape'], strict=True): self.operatorRepresentation[symName] = ctxt.lookup(tensor.name).name - for tensor, symName in zip(node.outputs, ['data_out']): + for tensor, symName in zip(node.outputs, ['data_out'], strict=True): self.operatorRepresentation[symName] = ctxt.lookup(tensor.name).nameThis would catch any unexpected length mismatches early, though it's not strictly necessary given the validation in
parseNode.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.md(3 hunks)Deeploy/CommonExtensions/CodeTransformationPasses/MemoryAllocation.py(1 hunks)Deeploy/DeeployTypes.py(7 hunks)Deeploy/Targets/Generic/Parsers.py(1 hunks)Deeploy/Targets/Generic/Templates/ReshapeTemplate.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- Deeploy/Targets/Generic/Templates/ReshapeTemplate.py
- Deeploy/CommonExtensions/CodeTransformationPasses/MemoryAllocation.py
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
Deeploy/DeeployTypes.py (3)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1)
lookup(74-75)DeeployTest/testUtils/dmaUtils.py (1)
size(72-73)Deeploy/AbstractDataTypes.py (3)
Pointer(303-388)PointerClass(536-559)VoidType(121-127)
Deeploy/Targets/Generic/Parsers.py (1)
Deeploy/DeeployTypes.py (3)
inputs(2503-2520)lookup(720-752)outputs(2522-2539)
🪛 Ruff (0.14.1)
Deeploy/DeeployTypes.py
241-241: Do not use mutable data structures for argument defaults
Replace with None; initialize within function
(B006)
Deeploy/Targets/Generic/Parsers.py
1070-1070: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
1072-1072: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🔇 Additional comments (3)
Deeploy/DeeployTypes.py (3)
241-241: LGTM! Mutable default argument bug fixed correctly.The change from
alias_of: Optional[List[str]] = []toaliases: Optional[List[str]] = Nonefixes the critical bug where all buffer instances shared the same alias list. The internal storage change toSet[str]is also appropriate since alias order is irrelevant and uniqueness is desired.Also applies to: 260-260
369-374: LGTM! Clean refactor using super().init.The refactored
TransientBuffer.__init__properly delegates to the parent class initializer while correctly settingshapeas a tuple(size,)and maintaining thesizeattribute for backward compatibility.
946-978: LGTM! Clean refactor to work directly with gs.Constant.The refactored
hoistConstantmethod is cleaner and more direct:
- Takes
gs.Constantdirectly instead of extracting from a node- Creates
ConstantBufferdirectly from the constant's values and shape- Properly handles optional type annotation
- The assertion on line 967 enforcing at most one output is a reasonable constraint
The comment about shape needing to be a tuple for pickling is helpful context.
Current aliasing is basically a bug. Instead of actually recording for each buffer who it's aliases are, it records all the aliases to all the buffers due to this line:
Did you spot the bug?
_Solution_
Ding! Ding! Ding! it's the alias_of initialization:
Every buffer got the same instance of a list, which meant every time any buffer added something to it's
alias_ofit got added to all of them.This PR gives a quick, not-super-optimal, but working, solution to the problem by implementing the aliasing information basically as a doubly-linked list. On a reshape, both input and output record each others name, and to check whether an alias is alive, we just traverse the aliasing double-linked list in all the directions with a bfs.
Imo for our current cases this is good enough performance-wise since there usually aren't so many aliasing chains happening, so I don't want to invest more time in the performance aspect.
As a bonus, I sprinkled a few refactors in there, the removal of bitrotten
fromVariableBuffers, refactor ofhoistConstantto not usefromVariableBuffer(be careful with those shapes!), and refactor of TransientBuffer's__init__function to use its super's one (reuse and all that).Changed
fromVariableBufferhoistConstant__init__Fixed
PR Merge Checklist
develcommit and pointing todevel.CHANGELOG.mdfile has been updated.